Identifying causal serum protein–cardiometabolic trait relationships using whole genome sequencing

Abstract
Cardiometabolic diseases, such as type 2 diabetes and cardiovascular disease, have a high public health burden. Understanding the genetically determined regulation of proteins that are dysregulated in disease can help to dissect the complex biology underpinning them. Here, we perform a protein quantitative trait locus (pQTL) analysis of 248 serum proteins relevant to cardiometabolic processes in 2893 individuals. Meta-analyzing whole-genome sequencing (WGS) data from two Greek cohorts, MANOLIS (n = 1356; 22.5× WGS) and Pomak (n = 1537; 18.4× WGS), we detect 301 independently associated pQTL variants for 170 proteins, including 12 rare variants (minor allele frequency < 1%). We additionally find 15 pQTL variants that are rare in non-Finnish European populations but have drifted up in frequency in the discovery cohorts studied here. We identify proteins causally associated with cardiometabolic traits, including Mep1b for high-density lipoprotein (HDL) levels, and describe a knock-out (KO) Mep1b mouse model. Our findings furnish insights into the genetic architecture of the serum proteome, identify new protein–disease relationships and demonstrate the importance of isolated populations in pQTL analysis.

expression of the protein in a wide range of tissues. We find, however, that the two pQTLs (rs62013200 and rs2289702) colocalised with eQTLs in only eight common tissues, suggesting distinct regulatory mechanisms between tissues and emphasising the need for future tissue- and cell-type-specific pQTL analyses to deepen our understanding of the genetic regulation of proteins.
For trans-pQTLs, colocalisation with eQTLs is one method used to map them to their causal genes. Sixty-one (72.6%) trans-pQTLs colocalised with an eQTL for at least one nearby gene in any tissue. We previously showed that testing a 2 Mb region around the pQTL and using a stringent threshold for positive colocalisation (PP4 > 0.8) could detect 71% of causal genes for cis-pQTLs (2), and we expect this proportion to be lower for trans-pQTLs because their smaller effect sizes reduce power. Future work will include combining eQTL colocalisation with additional methods, such as literature mining and pathway analysis, to confidently map causal trans genes.
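The gene-mapping rule described above (a 2 Mb region around the pQTL and a PP4 > 0.8 threshold) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `ColocResult` record, the use of a single representative gene coordinate, and the assumption that the 2 Mb region is centred on the pQTL (±1 Mb) are all hypothetical choices made for illustration, and the PP4 values are assumed to have been computed beforehand by a colocalisation tool.

```python
from dataclasses import dataclass

WINDOW_BP = 2_000_000   # total region size; assumed centred on the pQTL (+/- 1 Mb)
PP4_THRESHOLD = 0.8     # threshold for positive colocalisation, as in the text


@dataclass
class ColocResult:
    """Hypothetical record of one pQTL-eQTL colocalisation test."""
    gene: str
    tissue: str
    gene_pos: int   # representative gene coordinate in bp (an assumption)
    pp4: float      # posterior probability of a shared causal variant


def colocalised_genes(pqtl_pos, results,
                      window_bp=WINDOW_BP, pp4_threshold=PP4_THRESHOLD):
    """Return {gene: best PP4} for genes in the window that pass the threshold.

    A gene counts as colocalised if it passes in at least one tissue.
    """
    half_window = window_bp // 2
    hits = {}
    for r in results:
        in_window = abs(r.gene_pos - pqtl_pos) <= half_window
        if in_window and r.pp4 > pp4_threshold:
            hits[r.gene] = max(hits.get(r.gene, 0.0), r.pp4)
    return hits
```

Keeping the best PP4 per gene across tissues mirrors the "at least one nearby gene in any tissue" criterion; a tissue-aware analysis would instead keep the per-tissue results separate.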
Increased TYRO3 and CTSH levels are associated with an increased risk of diabetic kidney disease (DKD) in individuals with type 1 or type 2 diabetes, and reduced DLK1 levels are associated with an increased risk of DKD in individuals with type 2 diabetes (T2D). CTSH is a cysteine protease of the cathepsin family; the role of cathepsins in kidney disease has been extensively studied (8), although CTSH specifically has not been associated with kidney disease. DLK1 is an inhibitor of Notch signalling (9), a pathway involved in numerous biological processes.
Variation in the DLK1 gene has been associated in previous studies with T2D (10) and glycated haemoglobin (HbA1c) levels(11), a biomarker for diabetes, while knockout mice have decreased lean body mass and circulating glucose and increased lean body mass (12).
Both variants reside within a long non-coding RNA (lncRNA) transcript, LOC157273, which regulates hepatic glycogen deposition and the expression of a large number of genes (13). Another novel cis-pQTL for SUMF2 is associated with decreased serum SUMF2 (opposite effect from the trans-pQTL), but shows no evidence of causality for the same traits.
PLAUR encodes the urokinase receptor (uPAR); uPAR and its ligand, uPA, were not significantly associated with rs4760 in our analysis. We find that the trans-pQTL colocalises very strongly (PP=98.2%) with a gene expression QTL (eQTL) for CADM4 (cell adhesion molecule 4), suggesting that the variant may influence TNFRSF10C levels through CADM4. Pathway analysis using the STRING database (https://string-db.org/) showed no direct interactions between the two proteins. Further experiments are required to confidently map the target gene.

Supplementary Note 3
Description of the genetic architecture of serum MEP1B
Rs680321 is in LD (r² > 0.8) with one missense variant (rs616114; MAF = 0.40) and one splice region variant (rs335518; MAF = 0.44). For the missense rs616114, the amino acid replacement (P695L) might lead to altered phosphorylation in close proximity to the C-terminus of MEP1B, which could impact the serum turnover of the protein (14). Rs616114 is also associated with expression of MEP1B in lung tissue (GTEx), suggesting protein regulation at the transcriptional level.
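For readers unfamiliar with the LD measure used here, the following Python sketch computes r² between two variants as the squared Pearson correlation of their genotype dosages (0, 1 or 2 copies of the alternate allele). This is a common approximation to haplotype-based r² and is given only as an illustration; the text does not state which tool or reference panel was used for the LD calculation, and the function name `ld_r2` is hypothetical.

```python
def ld_r2(dosages_a, dosages_b):
    """Squared Pearson correlation between two genotype dosage vectors.

    Each vector holds one value per individual (0, 1 or 2 alternate alleles).
    This genotype-based r2 approximates, but is not identical to, the
    haplotype-based r2 reported by phased LD tools.
    """
    n = len(dosages_a)
    if n == 0 or n != len(dosages_b):
        raise ValueError("dosage vectors must be non-empty and of equal length")

    mean_a = sum(dosages_a) / n
    mean_b = sum(dosages_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosages_a, dosages_b))
    var_a = sum((a - mean_a) ** 2 for a in dosages_a)
    var_b = sum((b - mean_b) ** 2 for b in dosages_b)

    if var_a == 0 or var_b == 0:
        raise ValueError("r2 is undefined for a monomorphic variant")
    return (cov * cov) / (var_a * var_b)
```

Under this definition, two variants would be considered to be in LD in the sense used above when `ld_r2(...) > 0.8`.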

Characterisation and discussion of the Mep1b KO mouse
We carried out a targeted disruption of the catalytic centre of MEP1B, caused by a deletion of exon 7 of the wild-type allele interrupted by a neomycin resistance gene, resulting in a full-body KO mouse model. We have previously shown that Mep1b absence results in neither embryonic lethality nor overtly altered phenotypes; however, evidence suggests that it leads to changes in kidney gene expression (15,16).
At the German Mouse Clinic, we systematically phenotyped Mep1b KO animals generated from heterozygous crossings. Monitoring body weight from age 9 to 19 weeks, we [...] Figure 5H-J), suggesting that changes in body composition stem from an earlier age. As only female knockout mice were affected, we also investigated the sex-specific effect of the MEP1B pQTL, where we note a slightly stronger, albeit non-significant, effect in females (Supplementary Figure 7).
Although an increase in HDL cholesterol levels has been found to be associated with decreased MEP1B protein concentrations in humans, we did not detect significant effects of the Mep1b knockout on plasma triglyceride or cholesterol levels in mice in either the overnight-fasted or the ad libitum-fed state (Supplementary Figure 6A, B). The ratio of HDL to non-HDL cholesterol was also not significantly altered in mutant mice (Supplementary Figure 6C).
Mep1b KO mice showed subtle alterations in iron metabolism-related parameters, namely elevated plasma iron concentration and calculated total iron binding capacity (TIBC), while unsaturated plasma iron binding capacity (UIBC) was comparable between mutant and control mice (Supplementary Figure 6D-F). Increased renal transferrin receptor expression has also been observed in a previous study (16). Increased TIBC, a surrogate marker of transferrin levels, hints towards increased hepatic transferrin production, which is usually upregulated in response to intracellular iron deficiency, while ferritin production is downregulated under this condition (17). An association of cellular iron metabolism with the regulation of glucose metabolism and type 2 diabetes has recently been described (18,19).
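The text refers to a calculated TIBC without giving the formula. A widely used convention, assumed here rather than taken from the source, is that TIBC is the sum of plasma iron and UIBC. Under that assumption, the short Python sketch below shows why elevated plasma iron with comparable UIBC yields an elevated calculated TIBC. The function name and the example concentrations are hypothetical.

```python
def total_iron_binding_capacity(plasma_iron, uibc):
    """Calculated TIBC under the assumed convention TIBC = plasma iron + UIBC.

    Both inputs must be in the same concentration unit (for example umol/L);
    the result is returned in that unit.
    """
    if plasma_iron < 0 or uibc < 0:
        raise ValueError("concentrations must be non-negative")
    return plasma_iron + uibc


# Hypothetical values, not measurements from the study: with the same UIBC,
# a higher plasma iron concentration gives a higher calculated TIBC.
control_tibc = total_iron_binding_capacity(plasma_iron=20.0, uibc=40.0)
mutant_tibc = total_iron_binding_capacity(plasma_iron=25.0, uibc=40.0)
```

In this reading, the reported TIBC increase follows arithmetically from the plasma iron increase, since UIBC did not differ between genotypes.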
Lipoprotein profiles differ markedly between mice and humans. While low-density lipoproteins comprise the major lipoprotein fraction in humans, HDL is the dominant lipoprotein in mice. The general composition of the VLDL, LDL, and HDL lipoprotein fractions is similar in mice and humans (20); however, HDL can be further divided into subfractions in both species, with considerable differences in composition between them (21). These differences might account for the fact that no clear alteration of plasma HDL levels was observed in Mep1b KO mice. Indeed, an alteration of only one subfraction may be obscured by other, unaffected portions of the HDL lipoprotein fraction. Further studies are required to analyse this in detail.